Deep neural network training emphasizing central frames

نویسندگان

  • Gakuto Kurata
  • Daniel Willett
چکیده

It is common practice to concatenate several consecutive frames of acoustic features as input of a Deep Neural Network (DNN) for speech recognition. A DNN is trained to map the concatenated frames as a whole to the HMM state corresponding to the center frame while the side frames close to both ends of the concatenated frames and the remaining central frames are treated as equally important. Though the side frames are relevant to the HMM state of the center frame, this relationship may not be fully generalized to unseen data. Thus putting more emphasis on the central frames than on the side frames avoids overfitting to the DNN training data. We propose a new DNN training method to emphasize the central frames. We first conduct pre-training and fine-tuning with only the central frames and then conduct fine-tuning with all of the concatenated frames. In large vocabulary continuous speech recognition experiments with more than 1,000 hours of data for DNN training, we obtained a relative error rate reduction of 1.68%, which was statistically significant.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models

We describe a simple but effective way of using multi-frame targets to improve the accuracy of Artificial Neural NetworkHidden Markov Model (ANN-HMM) hybrid systems. In this approach a Deep Neural Network (DNN) is trained to predict the forced-alignment state of multiple frames using a separate softmax unit for each of the frames. This is in contrast to the usual method of training a DNN to pre...

متن کامل

A Hybrid Optimization Algorithm for Learning Deep Models

Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...

متن کامل

Provide a Deep Convolutional Neural Network Optimized with Morphological Filters to Map Trees in Urban Environments Using Aerial Imagery

Today, we cannot ignore the role of trees in the quality of human life, so that the earth is inconceivable for humans without the presence of trees. In addition to their natural role, urban trees are also very important in terms of visual beauty. Aerial imagery using unmanned platforms with very high spatial resolution is available today. Convolutional neural networks based deep learning method...

متن کامل

A Hybrid Optimization Algorithm for Learning Deep Models

Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...

متن کامل

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015